Audio Recording 
Compression 
Executive Summary 

Audio compression is used to reduce the storage 
requirements for all types of audio recordings. 
Compression is employed when storage space is limited. 
Compression is divided into two broad categories: lossy 
and lossless. Lossless compression maintains all of the 
original data. Lossy compression will discard data from 
the original. This data is lost forever. 

All audio compression uses a codec, which is a 
coder/decoder. Typically these are software based, and 
they vary greatly in terms of speed, sound quality and 
how much memory space they can save. 

The chosen audio compression method is determined 
by the final user’s preferences. If sound quality 
or positive verification is paramount, such as in a 
consensual recording scenario, a lossless or uncompressed 
file would be the preferable option. Where file size is 
critical, such as in a music playback device, a lossy codec 
is preferable. 

There is a vast array of codecs available, most of them 
free. Comparing these can be difficult for the 
average user. From an objective standpoint, codecs can 
be judged by how much the original file can be reduced 
or the time it takes to complete the process. Listening 
to the results is the true test of a codec. If the 
sound quality does not match the intended purpose, the 
codec should not be used. 



Defining Terms 

Before discussing audio compression it is beneficial 
to explain exactly what is meant by the term. Audio 
compression is employed to reduce the file sizes for any 
audio recording, from iPods to evidentiary files. 

Audio compression can also refer to dynamic range 
compression, where extremes, such as a movie passage 
containing explosions, are limited to allow playback on 
systems with smaller ranges. Simplistically, all signals over 
a certain “volume” are reduced to a lower maximum 
level. When audio speakers produce loud, distorted 
sounds, their dynamic range has been exceeded. 
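
As a rough illustration of dynamic range compression, the sketch below scales down whatever part of a sample exceeds a threshold. The function name, the threshold and the ratio are assumptions chosen for the example. Real compressors track a smoothed signal level rather than acting on each sample in isolation.

```python
import math


def compress_dynamic_range(samples, threshold, ratio):
    """Reduce the part of each sample that exceeds `threshold` by `ratio`.

    `samples` are floats in the range -1.0..1.0. This is a simplified,
    sample-by-sample model, not a production compressor.
    """
    out = []
    for s in samples:
        magnitude = abs(s)
        if magnitude > threshold:
            # Only the overshoot above the threshold is scaled down.
            magnitude = threshold + (magnitude - threshold) / ratio
        out.append(math.copysign(magnitude, s))
    return out


# Hypothetical input: quiet dialogue followed by a loud "explosion".
signal = [0.1, -0.2, 0.9, -1.0]
print(compress_dynamic_range(signal, threshold=0.5, ratio=4.0))
```

Quiet samples pass through unchanged, while the loud ones are pulled toward the threshold, which narrows the gap between the softest and loudest parts of the recording.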

Audio data compression is all about saving memory. 
Audio quality is sacrificed in favor of enhanced 
portability or a longer recording time. This is often at 
odds with law enforcement goals. 


Compression Basics 



Compression is used to fit more or longer files into a 
given block of memory. This is usually done to reduce 
file sizes for archiving/storage. To accomplish this goal, 
a codec is used. A codec is a combination of coder and 
decoder. This is essentially an algorithm that modifies 
the input file to make it smaller. The key is to keep the 
sound as close to the original as possible. 

Psychoacoustics 

Most codecs use what is known as psychoacoustic 
modeling. This is a branch of study that investigates what 
can be heard by the human ear under different conditions. 
The areas of study within psychoacoustics include 
frequency cut off, threshold of hearing, and masking. 

Frequency Cut Off 

The human ear is not perfect and can hear certain 
frequencies better than others. Sounds above about 20,000 
Hertz cannot be heard and can safely be discarded, 
reducing the memory required for storage. Some codecs 
discard everything above 16,000 Hertz with 
only minor effects on audio quality. 
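
The frequency cut off idea can be sketched with a small discrete Fourier transform that zeroes every frequency bin above a chosen limit. This is an illustration only. The function names, the 48,000 Hertz sample rate and the test tones are assumptions, and real codecs use far more efficient filter banks.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def low_pass(samples, sample_rate, cutoff_hz):
    """Discard all frequency content above cutoff_hz."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 mirror the lower bins, so map them back.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


# Assumed test signal: a 1,000 Hertz tone plus an 18,000 Hertz tone.
rate, n = 48000, 96
low = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]
high = [0.5 * math.sin(2 * math.pi * 18000 * t / rate) for t in range(n)]
filtered = low_pass([a + b for a, b in zip(low, high)], rate, 16000)
print(max(abs(f - a) for f, a in zip(filtered, low)))  # residual after filtering
```

With a 16,000 Hertz cut off the 18,000 Hertz tone is removed and the 1,000 Hertz tone survives, so the printed residual should be close to zero.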




Threshold of Hearing 

Sounds also have to be loud enough to excite the 
ear drum in the listener. A recording can contain data 
that would have been inaudible to someone present 
at the scene. These sounds can also be discarded with 
little or no effect on sound quality. However, with signal 
enhancement techniques, it might be possible to make 
the discarded data audible. This would mean that 
possible evidence is being deleted. 
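
The risk described above can be shown with a crude stand-in for a hearing threshold: any block of samples whose level falls below a fixed limit is thrown away. The function names, the block size and the -50 dB limit are assumptions made for illustration. Real codecs use a frequency-dependent threshold curve rather than a single number.

```python
import math


def rms_dbfs(frame):
    """Level of a block of samples in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return -math.inf if rms == 0 else 20 * math.log10(rms)


def discard_quiet(samples, frame_size, threshold_db):
    """Zero out every block whose level is below threshold_db."""
    out = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        if rms_dbfs(frame) >= threshold_db:
            out.extend(frame)
        else:
            out.extend([0.0] * len(frame))  # this content is gone for good
    return out


# Hypothetical recording: a loud passage followed by a very faint one.
original = [0.5] * 4 + [0.001] * 4
compressed = discard_quiet(original, frame_size=4, threshold_db=-50.0)

# Enhancement (here, simple amplification) recovers the faint passage
# from the original but not from the compressed copy.
print([round(s * 100, 3) for s in original[4:]])
print([round(s * 100, 3) for s in compressed[4:]])
```

Amplifying the original brings the faint passage up to a usable level, while amplifying the compressed copy yields only silence.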

Masking 

[Figure: sound pressure level (SPL, in dB) plotted against frequency (in kHz), showing a “Masker” signal and the “Masked Tones” around it.] 

In the diagram the signal marked as the “Masker” 
would overpower the sounds marked as “Masked 
Tones”, making them inaudible. There has been a 
substantial amount of study done to determine what 
a given Masker will do to surrounding sounds. The 
loudness of the Masker has a great bearing on other 
nearby tones. More masking can be expected in 
proportion to the loudness of prominent sounds. 

An easy example to make this clear is an airplane 
passing overhead at a low altitude. If a normal 
conversation is taking place, as the plane nears, 
communication at the previous volume becomes 
impossible. Talking at the same loudness in this 
situation will be pointless. The louder sound masks the 
softer sound. 

Also, frequency has a great deal of influence here. 
A Masker at one frequency might block nearby sounds 
out, while at another frequency, nothing would be 
noticed by a listener. 

Compression Schemes 

Compression is divided into two broad categories: lossy 
and lossless. Lossless compression is similar to a Zipped 
computer file in that all original information is retained, 
bit for bit. Lossy compression schemes discard data 
that is deemed unnecessary or redundant. The original 
recording is not completely recoverable when using any 
lossy method. 

The diagram below shows a comparison of the 
two methods and the resultant file sizes. Note that 
the lossy compressed file is smaller than the lossless file. 
This is a primary reason that lossy files are preferred in 
many applications. Technically, the restored lossy file 
is actually the same size as the original; the smaller box 
here is to indicate that content is lost from the original. 

[Diagram: lossless and lossy compression compared.] 


All audio compression uses a codec, which is a 
coder/decoder. These can be implemented in software 
or hardware. These codecs vary widely in their speed of 
processing, quality of playback, and degree of shrinking 
the original file size. Audio codecs are preferable to 
audio/video codecs when working with audio files as 
they achieve a greater compression percentage. 
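
The difference between a lossless and a lossy coder/decoder pair can be sketched with toy codecs. In the example below, zlib (a general purpose compressor, not an audio codec) stands in for the lossless case, and dropping the low 8 bits of each 16-bit sample stands in for the lossy case. Both choices, and all function names, are assumptions made for illustration.

```python
import math
import struct
import zlib


def encode_lossless(pcm):
    """Coder: compress raw PCM bytes without discarding anything."""
    return zlib.compress(pcm, 9)


def decode_lossless(blob):
    """Decoder: restore the original bytes exactly."""
    return zlib.decompress(blob)


def encode_lossy(samples):
    """Coder: keep only the top 8 bits of each 16-bit sample."""
    return struct.pack(f"{len(samples)}b", *(s >> 8 for s in samples))


def decode_lossy(blob):
    """Decoder: expand back to 16-bit values; the low bits cannot be recovered."""
    return [b << 8 for b in struct.unpack(f"{len(blob)}b", blob)]


# Assumed test signal: one second of a 440 Hertz tone sampled at 8,000 Hertz.
samples = [int(12000 * math.sin(2 * math.pi * 440 * t / 8000)) for t in range(8000)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

restored_lossless = decode_lossless(encode_lossless(pcm))
restored_lossy = decode_lossy(encode_lossy(samples))

print(restored_lossless == pcm)               # bit for bit identical
print(restored_lossy == samples)              # content has been lost
print(len(restored_lossy) == len(samples))    # restored file is the same size
```

The lossy copy decodes to the same number of samples as the original, yet the values differ, which is why a lossy recording can never be fully restored.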

One technical issue to be aware of is bit rate. Bit rate 
refers to the quantity of data allocated to a given period 
of time. The higher the bit rate, the better the sound. 
However, this will also increase the file size, which in 
some cases is undesirable. Bit rates are typically shown 
in kilobits per second, or kbps. 
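
The link between bit rate and file size is simple arithmetic, sketched below. The function name is hypothetical, and the calculation assumes a constant bit rate in kilobits per second, 1 MB = 1,000,000 bytes, and no header or metadata overhead.

```python
def file_size_mb(bit_rate_kbps, seconds):
    """Approximate size in MB of audio encoded at a constant bit rate."""
    bits = bit_rate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


# A 4 minute (240 second) recording at two bit rates.
for rate in (320, 32):
    print(rate, "kbps ->", file_size_mb(rate, 240), "MB")
```

Under these assumptions a 4 minute recording takes about 9.6 MB at 320 kbps and under 1 MB at 32 kbps, so a tenfold drop in bit rate gives a tenfold drop in file size.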

The audio compression scheme used is completely 
dependent on the intended use. 

Where file size is critical, such as in a mobile 
application, a lossy codec is almost always the 
preferred method. 

The current state of the art in codecs is as vast as 
the different uses. Objectively, these codecs can be 
compared in two different ways: compression ratio 
and coding and decoding speed. Subjectively, codecs 
are judged on sound quality with regard to fidelity and 
faithfulness to the original. 
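
Both objective measures can be collected with a small harness like the one below. It is a sketch only. The `benchmark` function and its result keys are hypothetical, and zlib at two effort levels stands in for two competing codecs because real audio codecs are not part of the Python standard library.

```python
import time
import zlib


def benchmark(encode, decode, data):
    """Measure size reduction and coding/decoding time for one codec."""
    t0 = time.perf_counter()
    encoded = encode(data)
    t1 = time.perf_counter()
    decoded = decode(encoded)
    t2 = time.perf_counter()
    return {
        "ratio": len(data) / len(encoded),        # e.g. 2.0 means half the size
        "saving": 1 - len(encoded) / len(data),   # fraction of memory saved
        "encode_s": t1 - t0,
        "decode_s": t2 - t1,
        "lossless": decoded == data,
    }


# Assumed test data: a repetitive byte pattern that compresses well.
data = bytes(range(256)) * 200

for level in (1, 9):
    result = benchmark(lambda d: zlib.compress(d, level), zlib.decompress, data)
    print("level", level, result)
```

Timings depend on the machine, so only the relative figures between codecs are meaningful. Sound quality still has to be judged by listening.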
Compression Ratio 

Compression ratio refers to the actual amount of 
memory saved. The higher the compression ratio, the 
smaller the encoded file. Lossy codecs have much higher 
compression ratios. This illustration gives a comparison 
(AIFF stands for audio interchange file format and is an 
uncompressed file format): 

Audio File Sizes (4 minute song) 

AIFF              90 MB 
FLAC              45 MB 
Apple Lossless    40 MB 
MP3 320 kbps       9 MB 
MP3 190 kbps       5 MB 

As can be seen, the memory savings for the lossy 
codecs are far superior to the lossless. This also gives you 
an idea of why they are so popular with portable devices 
such as the iPod. 

Sound Quality 

Sound quality of a codec is a highly subjective topic. One 
listener might find a codec to sound fine while another 
might find it marginal. Most of the time, the major codecs 
do an effective job of replicating the original recording. 
They do not replicate a high quality recording exactly, and 
there is degradation in all lossy codecs, but they would 
pass a listening test from a majority of users. An exception 
to this is a codec employed with a very low bit rate. The 
bit rate is the amount of information (memory) given to each 
second of audio. Many codecs allow low bit rates and the 
resultant audio can be inferior. Using low bit rates in a 
law enforcement recording is highly discouraged. 

Coding Speed 

There are two different speeds for each codec: 
encoding and decoding. In some applications the 
encoding speed is more important. For instance, when 
ripping CDs, a faster encoding speed would enable 
the user to complete the compression faster. Playback 
decoding speeds might be important when listening to 
music. A user would not want to wait 5 or 6 seconds 
for a track to start after pushing play. 


Lossy Codecs 


MP3 


MP3 is by far the most well known and popular 
audio codec. This codec is used in a wide variety of 
devices and is the de facto standard for internet audio 
downloads. MP3, which stands for Moving Picture 
Experts Group (MPEG) Audio Layer III, is not 
easily characterized. MP3 has many available bit rates 
ranging from 32 to 320 kbps. The size and quality of 
the files are directly correlated to bit rate. At 320 kbps, 
the sound quality is quite good, while at 32 kbps it is 
very poor, as you might imagine. Coding speed is also 
dependent on bit rate. 


Dolby AC-3 

Dolby Digital, which you probably know from 
going to theatres, was first developed to carry digital 
audio on 35 mm cinema film prints. AC-3 has 5 channels 
plus a limited frequency subwoofer channel. The bit rate is 
fixed at 320 kbits/sec and audio quality is quite good. 




Law Enforcement Sensitive Information For Official Use Only (LES/FOUO) Redisclosure Authorized by FBI Only 


Windows Media Audio (WMA) 

WMA was designed in an attempt to 
compete with MP3 and RealAudio 
(RealAudio is the popular streaming 
software which uses many different 
codecs). It is standard in any Windows 
software and delivers acceptable 
audio. There are four different WMA 
codecs including WMA Pro, which is the latest iteration. 
The speed of the codec is dependent on the version used. 

MPEG-4 

MPEG-4 is a method of defining how to 
compress audio and video. There are 
numerous standards documents detailing the specifics. 
MPEG-4 is used by DirecTV satellite service. There 
are different versions of MPEG-4, but they are all 
considered to give average to excellent audio and 
video. Decoding speed is determined by settings 
chosen by the user. 


Lossless Codecs 

Waveform Audio File Format (WAV) 

WAV is a Microsoft and IBM file format that 
typically stores uncompressed audio. 
These files are large and infrequently used for file 
sharing. The format is used widely where file space and 
time of compression are not a concern. These files have 
a maximum size of 4 Gbytes, which is about 6.8 hours 
of high quality stereo audio. 
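
The 6.8 hour figure can be reproduced with a short calculation. The sketch assumes that "high quality stereo" means CD-quality audio (44,100 samples per second, 16 bits per sample, 2 channels), that 4 Gbytes means 4 x 2^30 bytes, and that header overhead is negligible. None of these parameters is stated in the text, and the function name is hypothetical.

```python
def max_duration_hours(max_bytes, sample_rate, bits_per_sample, channels):
    """Longest uncompressed recording that fits in max_bytes."""
    bytes_per_second = sample_rate * (bits_per_sample // 8) * channels
    return max_bytes / bytes_per_second / 3600


hours = max_duration_hours(4 * 2**30, sample_rate=44100, bits_per_sample=16, channels=2)
print(round(hours, 1))
```

Doubling the sample rate or the bit depth halves the available recording time.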

Apple Lossless Audio Codec (ALAC) 

Apple developed this lossless codec in 2004 and kept it 
proprietary until 2011. This codec achieves a 40-60% 
compression ratio. Compared to other codecs, it is 
less power hungry when decoding. ALAC also, unlike 
WAV files, allows for metadata to be added, such as 
dates and pictures. 

Free Lossless Audio Codec (FLAC) 

This codec is very popular for 
storing compact disks to a hard 
drive. This compression is quite 
effective and can achieve nearly 40-50% compression 
ratios. FLAC has one of the fastest decoding speeds, 
which is ideal for music playback. It also allows 
metadata. 

Meridian Lossless Packing (MLP) 

This proprietary codec is 
very flexible and allows up 
to 8 channels of audio. It 
is used in DVD-Audio and 
Dolby TrueHD, which is a Blu-ray audio format. The 
compression ratio is typically 33%. 